ABSTRACT

A finite element solution to the convected Helmholtz equation in a nonuniform flow is used to
model the noise field within 3-D acoustically treated aero-engine nacelles. Options to select linear
or cubic Hermite polynomial basis functions and isoparametric elements are included. However,
the key feature of the method is a domain decomposition procedure that is based upon the
intermixing of an iterative and a direct solve strategy for solving the discrete finite element equations.
This procedure is optimized to take full advantage of sparsity and exploit the increased memory
and parallel processing capability of modern computer architectures. Example computations are
presented for the Langley Flow Impedance Test Facility and a rectangular mapping of a full scale,
generic aero-engine nacelle. The accuracy and parallel performance of this new solver are tested on
both model problems using a supercomputer that contains hundreds of central processing units.
Results show that the method gives extremely accurate attenuation predictions, achieves super-linear
speedup over hundreds of CPUs, and solves upward of 25 million complex equations in a quarter of
an hour.


1 NOMENCLATURE

= sound speed, mean density, mean velocity vector
F, Φ = vector containing source effects, vector of nodal velocity potentials
f, i, N = source frequency, unit imaginary number, order of global stiffness matrix
I, PP, ΔdB = acoustic intensity, acoustic power, liner attenuation
= stiffness matrix, dimensionless node impedance matrix
= unit normal vector, 3-D gradient operator
= acoustic pressure, velocity potential, particle velocity vector
ζ = wall impedance normalized by $\rho_0 c_0$
= vectors for defining nonreflecting boundary condition
= real part of complex expression, complex conjugate, dot product
= source boundary, exit boundary

Subscripts:

I, b, s = interior unknown, boundary unknown, source potential

2 INTRODUCTION 

Commercial aircraft noise in communities near airports has become a major socio-economic
problem both within the U.S. and abroad. Because of the continuous increase in aircraft capacity
and the number of flights, commercial air transportation remains in continuous growth. However,
the high noise levels emitted during takeoff and landing threaten to severely compromise this
growth worldwide, so the reduction of aircraft noise, especially in communities near airports, has
become a major challenge. The high levels of noise produced by modern aircraft at takeoff and
landing may be categorized as either airframe or engine noise. Airframe noise is due to the
interactions of the flow with the solid aircraft components. Engine noise is generally decomposed
into its constitutive sources, the two main components being jet and fan noise. Jet noise is caused
by the ejection of fast hot gases through the engine, and fan noise is fluid/structure interaction
noise generated by the rotating turbomachinery within the engine. The introduction of high-bypass
ratio engines has enabled a substantial reduction of jet noise so that engine noise in today's large
civil aircraft is dominated by fan noise [1].

Noise reduction methods for fan noise have mainly involved the installation of advanced
liners within the nacelles to absorb the noise generated by the fan noise sources [2]. This approach
requires an accurate prediction of both the noise absorbed and radiated from modern nacelles
so that the treatment can be optimized for maximum noise reduction. Due to the complex 3-D
geometries and flows inside modern nacelles, such predictions remain out of reach of theoretical
modeling, and experimental methods have proved too costly. Thus, the tool of choice has been
numerical simulation. To this end, several 3-D finite element codes have been developed both
in the U.S. and abroad [3, 4]. These codes generate a large, sparse, linear system of algebraic
equations that must be solved in an efficient manner to compute the radiated noise and provide the
ability to assess various low-noise designs.

Methods for solving large sparse systems of linear algebraic equations are either direct or
iterative. Iterative methods in use today are usually based upon Krylov subspace methods [5].
The iterative methods have the advantage of not requiring computation and storage of an inverse
matrix and are highly scalable on massively parallel supercomputers. However, the convergence
rates of the iterative methods are highly dependent on the existence of good preconditioners that
are currently not available for nacelle problems with arbitrary 3-D geometries and wall lining.
Consequently, the more robust direct methods have generally been the solvers of choice in nacelle
aeroacoustics [3, 4]. Because direct methods are based upon the factorization and storage of a
matrix inverse, they are inefficient when used on realistic 3-D geometries.

In an earlier paper, the authors introduced a "hybrid" solve strategy for analyzing acoustic fields
in aero-engine ducts [6]. This approach has the potential to overcome many of the limitations of
iterative and direct methods. However, the work in the earlier paper contained several
shortcomings. First, it was tested only in a hard wall duct and at a single frequency. Second, the earlier
work was restricted to symmetric matrices so that mean flow could not be accommodated. Third,
the developed software was limited to a million equations and to only 64 central processing units
(CPUs). Finally, the analysis was restricted to a uniform brick element with linear basis functions
that limited its applicability to a rectangular geometry. The purpose of this paper is to describe
efforts to remove the above-mentioned limitations of the earlier work.

3 GOVERNING EQUATIONS AND BOUNDARY CONDITIONS

Fig 1: Typical aircraft nacelle with acoustic liners.


Figure 1 is a schematic of an aircraft nacelle with sound absorbing material (acoustic liners).
The sound absorbing material is locally reacting and is characterized by an impedance, ζ, that is a
function of position along the treated surfaces. The problem at hand is to determine the attenuation
produced by the wall lining in the presence of a flowing fluid in the duct. The governing differential
equation (assuming irrotational homentropic flow) is [4]
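
For orientation, a standard time-domain form of the convected wave equation for the velocity
potential in an irrotational, homentropic mean flow is sketched below. The notation is an
assumption made here for illustration ($\mathbf{u}_0$ for the mean velocity vector, $D/Dt$ for the
derivative following the mean flow) and may differ from that used in [4]. For a time-harmonic
source, $\partial/\partial t$ is replaced by $\pm i\omega$, with the sign depending on the time
convention adopted.

```latex
\nabla\cdot\left(\rho_0\nabla\phi\right)
  - \rho_0\,\frac{D}{Dt}\!\left(\frac{1}{c_0^{2}}\,\frac{D\phi}{Dt}\right) = 0,
\qquad
\frac{D}{Dt} = \frac{\partial}{\partial t} + \mathbf{u}_0\cdot\nabla
```

In the same assumed notation, the linearized momentum relation gives the acoustic pressure as
$p = -\rho_0\,D\phi/Dt$ and irrotationality gives the particle velocity as $\mathbf{u} = \nabla\phi$.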

Near the fan face, the acoustic velocity potential is assumed known

$$\phi = \phi_s \qquad (2)$$


The wall liner boundary condition is expressed in the form [7] 

At the exit of the nacelle a nonlocal, nonreflecting boundary condition [8] that has been extended
by the authors to include flow effects is implemented


P E = %bUe 


(4) 

Finally, at the intersections between hard and treated surfaces the acoustic pressure and normal
component of acoustic particle velocity are required to be continuous.

Upon obtaining the acoustic velocity potential, the acoustic particle velocity vector and acoustic
pressure field are post-processed from the irrotationality and the linearized, homentropic
conditions

= p = ( 5 ) 

The sound attenuated by the wall lining, in decibels, is obtained from the logarithm of the ratio of
the input to the output acoustic power

$$\Delta\mathrm{dB} = 10\log_{10}\left[\frac{PP(S_{\mathrm{source}})}{PP(S_{\mathrm{exit}})}\right], \qquad PP(S) = \int_S I\, ds \qquad (6)$$

where the acoustic intensity, I, is expressed in the form given by Morfey [9]


/ = -5ft 
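
The attenuation definition above reduces to a few lines of arithmetic once the intensity has been
sampled on the source and exit planes. The following is a minimal sketch, assuming the intensity
is already available as samples on a uniform surface grid; the function names `acoustic_power` and
`attenuation_db` and the simple summation rule are illustrative choices, not the authors' code.

```python
import math


def acoustic_power(intensity_samples, cell_area):
    """Approximate PP(S), the integral of I over S, by summing I * dS on a uniform grid."""
    return sum(intensity_samples) * cell_area


def attenuation_db(power_in, power_out):
    """Liner attenuation in decibels from the input and output acoustic power."""
    return 10.0 * math.log10(power_in / power_out)


# Hypothetical intensity samples (arbitrary units) on four equal cells per plane.
power_source = acoustic_power([1.0, 1.0, 1.0, 1.0], cell_area=0.25)
power_exit = acoustic_power([0.01, 0.01, 0.01, 0.01], cell_area=0.25)
delta_db = attenuation_db(power_source, power_exit)
```

A power ratio of 100 corresponds to 20 dB, and equal input and output power (a rigid wall duct with
no dissipation) corresponds to 0 dB.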

4 THE FINITE ELEMENT MODEL

The numerical method chosen to solve the potential equation coupled with the sound source,
wall impedance, and exit boundary conditions is the finite element method (FEM). Details of the
FEM are beyond the scope of the current paper. However, a brick element containing either linear
or cubic Hermite polynomial basis functions is utilized to obtain the solution for the acoustic
potential within the nacelle. For nonuniform geometries the brick element is transformed to an
isoparametric element. Galerkin's finite-element method is used to minimize the field error and
obtain the acoustic velocity potential. A weak formulation is introduced so that the wall and exit
impedance boundary conditions are introduced at the element level. The elements for the entire
domain are assembled in the usual manner and the source condition is satisfied by constraining the
nodal degrees of freedom at the source plane. This leads to a sparse system containing N complex,
linear, algebraic equations of the form

$$K\Phi = F \qquad (8)$$


5 THE HYBRID EQUATION SOLVER 

The equation solving paradigm introduced in this text is based upon the intermixing of a direct
sparse solver and an iterative solver. The key idea is to combine the benefits of direct sparse
solvers (robustness and speed) and iterative solvers (low memory usage and scalability) to obtain
the solution to Eq. (8) in a memory and time efficient manner. The hybrid solve strategy begins by
partitioning Eq. (8) into interior and boundary unknowns [11]

$$\begin{aligned}
K_{bb}\Phi_b + K_{bI}\Phi_I &= F_b \\
K_{Ib}\Phi_b + K_{II}\Phi_I &= F_I
\end{aligned} \qquad (9)$$

Solutions to Eq. (9) are of the form

$$\begin{aligned}
\bar{K}_{bb}\Phi_b &= \bar{F}_b \\
K_{II}\Phi_I &= F_I - K_{Ib}\Phi_b
\end{aligned} \qquad (10)$$

where

$$\begin{aligned}
\bar{K}_{bb} &= K_{bb} - K_{bI}K_{II}^{-1}K_{Ib} \\
\bar{F}_b &= F_b - K_{bI}\bar{F}_I \\
K_{II}\bar{F}_I &= F_I
\end{aligned} \qquad (11)$$

The sequence of steps constituting the hybrid DD formulation proposed in this text is

1. Compute $K_{Ib}$, $K_{bI}$, $K_{II}$, $K_{bb}$, $F_b$, and $F_I$ using efficient sparse assembly
algorithms [10].

2. Factorize the sparse matrix $K_{II}$ and compute $\bar{F}_I$ from Eqs. (11) using algorithms and
software discussed in [10].

3. Compute $\bar{F}_b$ using Eqs. (11). The matrix $\bar{K}_{bb}$ is also defined by Eqs. (11), but its
explicit computation would be expensive due to the need to perform the triple product.

4. Use an iterative solver to obtain the boundary unknowns in Eq. (10). The iterative solver needs
only the product of $\bar{K}_{bb}$ with a vector, so the explicit computation of this triple product
is avoided.

5. Upon obtaining the boundary unknowns, the interior unknowns are obtained from Eq. (10)
by using the previously factorized sparse matrix $K_{II}$.

6. To better facilitate application of the strategy on massively parallel computer architectures,
a domain decomposition (DD) formulation [12, 13] is applied to the computational volume.
Equations (10)-(11) are therefore subdivided into subdomains and each subdomain is assigned
to a processor. After each processor has completed its task, the solutions are merged to
obtain the solution vector. The solution vector is then post-processed to obtain the sound
attenuation, ΔdB.
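
Steps 1 through 5 can be sketched on a single process with SciPy. This is a minimal illustration
under stated assumptions, not the DIPSS implementation: the helper name `hybrid_solve` is
hypothetical, the test matrix is a small synthetic complex tridiagonal system rather than a nacelle
model, no preconditioner is applied, and the distribution of subdomains over processors (step 6)
is omitted. The reduced boundary matrix is never formed; GMRES only sees its action on a vector.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla


def hybrid_solve(K, F, boundary, tol=1e-8, maxiter=2000):
    """Direct solve on interior unknowns, GMRES on boundary unknowns (hypothetical helper)."""
    K = sp.csr_matrix(K)
    n = K.shape[0]
    b = np.asarray(boundary, dtype=int)
    interior = np.ones(n, dtype=bool)
    interior[b] = False
    i = np.flatnonzero(interior)

    # Step 1: extract the four blocks of the partitioned system.
    K_II = K[i][:, i].tocsc()
    K_Ib = K[i][:, b]
    K_bI = K[b][:, i]
    K_bb = K[b][:, b]

    # Step 2: factorize K_II once and solve K_II F_I_bar = F_I.
    lu = spla.splu(K_II)
    F_I_bar = lu.solve(F[i])

    # Step 3: reduced right-hand side for the boundary unknowns.
    F_b_bar = F[b] - K_bI @ F_I_bar

    # Step 4: apply the reduced matrix to a vector without forming the triple product.
    def reduced_matvec(x):
        x = np.ravel(x)
        return K_bb @ x - K_bI @ lu.solve(K_Ib @ x)

    S = spla.LinearOperator((b.size, b.size), matvec=reduced_matvec, dtype=K.dtype)
    try:
        phi_b, info = spla.gmres(S, F_b_bar, rtol=tol, maxiter=maxiter)
    except TypeError:  # older SciPy releases call this argument `tol`
        phi_b, info = spla.gmres(S, F_b_bar, tol=tol, maxiter=maxiter)

    # Step 5: recover the interior unknowns with the stored factorization.
    phi_I = lu.solve(F[i] - K_Ib @ phi_b)

    phi = np.empty(n, dtype=K.dtype)
    phi[b] = phi_b
    phi[i] = phi_I
    return phi, info


# Synthetic complex, diagonally dominant test matrix (not a nacelle model).
n = 60
K = sp.diags([-1.0, 4.0 + 0.5j, -1.0], [-1, 0, 1], shape=(n, n), dtype=complex, format="csr")
F = np.ones(n, dtype=complex)
phi, info = hybrid_solve(K, F, boundary=np.arange(0, n, 10))
reference = spla.spsolve(K.tocsc(), F)
max_error = np.max(np.abs(phi - reference))
```

Choosing every tenth node as a boundary unknown splits the interior into independent blocks,
which is what makes the interior factorization cheap and, in a parallel setting, local to each
subdomain.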


6 RESULTS AND DISCUSSIONS 

This section presents results for zero flow, as well as selected examples with flow. The hybrid
solver uses sparse factorization techniques presented in the book of Nguyen [10] and implements
either the conjugate gradient (symmetric matrices) or the Generalized Minimal Residual
(asymmetric matrices) method as the iterative solver [5] for the dense system defined in Eq. (10).
Both iterative solvers are implemented with the diagonal of $K_{bb}$ as the preconditioner. Because
this diagonal is available without forming $\bar{K}_{bb}$, the preconditioner is less expensive than
the Jacobi preconditioner, which would require the diagonal of $\bar{K}_{bb}$. The
primary hardware utilized was the Columbia cluster (a Silicon Graphics Altix 3700 distributed
memory system with 1 TB of RAM and 512 Itanium2 CPUs with clock speeds of 1.5 GHz). The
developed software is referred to as the direct iterative parallel sparse solver (DIPSS). The
authors have examined a number of freely available and at least two commercially available parallel,
sparse, direct solvers against which to benchmark the speed of the new hybrid solve strategy. It was
determined that the commercially available SGI parallel sparse solver has a lower wall clock time
than the other solvers on the Columbia cluster. Thus, the decision was made to benchmark the
hybrid solver against the SGI parallel, direct, sparse solver. It should be noted that the lower wall
clock time of the SGI sparse solver may reflect the fact that this solver was specifically designed
to take full advantage of many special features of Columbia's hardware. Further, some of the
commercially available software packages were limited to only 32 CPUs. Our comparison study with
other direct solvers was therefore restricted to only 32 CPUs. It should also be noted that many of
the solvers tested were limited in the number of reordering schemes.

Two rectangular geometries are used to benchmark the accuracy, efficiency, and robustness of
the DIPSS solution methodology presented in this paper. The first geometry is that of the LaRC
Flow Impedance Test Facility and the second geometry is a rectangular mapping of a generic
aero-engine duct. The Langley Flow Impedance Test Facility geometry was chosen because it affords
the authors the opportunity to compare solution statistics to those of the SGI parallel sparse solver
and of the version of the DIPSS solver presented in an earlier paper [6]. The generic
aero-engine duct was chosen because it allows for large-scale computations in geometries with short
length to diameter ratios (comparable to that of a large commercial engine) where many modes are
propagating and tens of millions of grid points are required for accurate resolution of the sound field.
Only uniform rectangular elements are implemented due to the rectangular geometries involved.
Furthermore, the iterative solve portion of the DIPSS software was run until the L2 norm of the
residual reached a specified tolerance or until 2,000 iterations were reached. The tolerance was set
at $10^{-8}$ for all results presented in this section.

6.1 LaRC Flow Impedance Test Facility 

This geometry is 81.28 cm long and has a 5.08 cm x 5.08 cm cross section. We consider
zero flow ($M_0$ = 0.0) and a 36 x 36 x 775 uniform grid (N = 1,004,440). Hard wall statistics for the
SGI sparse solver are compared to those of an earlier version (DIPSS V1) and to the current version
(DIPSS V2) of the DIPSS software in table 1. Standard atmospheric conditions ($\rho_0$ = 1.2 kg/m$^3$
and $c_0$ = 344 m/s) were used to perform the computations.

Table 1: Hard Wall Statistics for LaRC Flow Impedance Test Facility.
(Zero flow, hard walls, f = 3.5 kHz, N = 1,004,440)

          Wall Clock Time, sec              RAM, GB
CPUs      SGI      DIPSS V1   DIPSS V2      SGI      DIPSS V2
1         1428     N/A        N/A           12.01    N/A
2         751      4880       N/A           12.01    N/A
4         400      1766       N/A           12.01    N/A
8         242      432        402           12.02    11.04
16        377      146        133           12.09    9.19
32        150      60         55            12.66    6.89
64        185      33         35            12.29    10.62
128       192      89         70            13.00    12.76
256       580      155        146           13.00    12.86


Table 1 exemplifies the primary problem encountered by direct solvers (such as the SGI solver)
for even moderate size acoustic problems. The SGI solver (column 2) leads to low wall clock
turnaround times for a small number of CPUs, but scales poorly as the number of CPUs is
increased. DIPSS (columns 3 and 4) gives super-linear speedup up to 64 CPUs and is considerably
faster than the SGI solver on 128 and 256 CPUs. Notice that the wall clock time for DIPSS is
considerably less than that of the direct solver on as few as 16 CPUs. On 64 CPUs, DIPSS reduces
the wall clock turnaround by a factor of five to six compared to the SGI solver (185 s versus 33 to
35 s). Observe also that more
than four CPUs are required to obtain a solution using the current strategy, whereas, for the earlier
version this same example could be solved on as few as two CPUs. Thus, the current
implementation of DIPSS (DIPSS V2) requires more startup memory, but runs a little more efficiently than the
earlier version (DIPSS V1). The corresponding RAM for the SGI solver and DIPSS V2 is given
in columns 5 and 6, respectively. Note that DIPSS also gives savings in RAM compared
with the SGI solver. The RAM savings are observed to be considerable in the middle of
the CPU range.
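
The speedup statements can be checked directly from the wall clock times in Table 1. The sketch
below is illustrative only: it takes the 8-CPU run as the baseline for DIPSS V2 (an assumption,
made because that is the smallest CPU count for which a V2 time is reported) and computes the
parallel efficiency relative to that baseline, where values above 1 indicate super-linear speedup.

```python
# Wall clock times in seconds, copied from Table 1.
dipss_v2_seconds = {8: 402, 16: 133, 32: 55, 64: 35, 128: 70, 256: 146}
sgi_seconds = {1: 1428, 2: 751, 4: 400, 8: 242, 16: 377, 32: 150, 64: 185, 128: 192, 256: 580}


def relative_efficiency(times, base_cpus):
    """Speedup relative to the baseline run, divided by the increase in CPU count."""
    base_time = times[base_cpus]
    return {cpus: (base_time / t) / (cpus / base_cpus) for cpus, t in times.items()}


efficiency = relative_efficiency(dipss_v2_seconds, base_cpus=8)
sgi_to_dipss_at_64 = sgi_seconds[64] / dipss_v2_seconds[64]

for cpus in sorted(efficiency):
    print(f"{cpus:4d} CPUs: efficiency relative to 8 CPUs = {efficiency[cpus]:.2f}")
```

Relative to the 8-CPU run, the efficiency exceeds 1 at 16, 32, and 64 CPUs and falls below 1 at 128
and 256 CPUs, where the wall clock time rises again for this moderate problem size.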

Table 2: Attenuations and Solver Statistics for LaRC Flow Impedance Test Facility.
(Zero flow, N = 1,004,440, 16 CPUs used)

         Hard wall duct statistics                  Soft wall duct statistics
f,       Anal     DIPSS    Wall clock,   No.        Anal     DIPSS    Wall clock,   No.
kHz      ΔdB      ΔdB      sec           Iter       ΔdB      ΔdB      sec           Iter
0.5      0.00     0.00     147           57         34.75    34.71    154           79
1.0      0.00     0.00     144           55         41.96    41.90    153           77
1.5      0.00     0.00     145           51         45.65    45.58    152           75
2.0      0.00     0.00     145           51         47.60    47.53    151           72
2.5      0.00     0.00     140           38         48.13    48.05    149           67
3.0      0.00     0.00     135           26         47.19    47.11    148           63
3.5      0.00     0.00     130           15         44.62    44.55    142           48
4.0      0.00     0.00     134           23         40.46    40.38    143           50
4.5      0.00     0.00     141           41         35.20    35.13    145           58
5.0      0.00     0.00     137           31         29.76    29.70    159           72


To validate the accuracy of the DIPSS solution over a frequency range, the authors used the
analytically computed attenuation produced by the liner, ΔdB, as a metric. This metric is physically
more meaningful than the acoustic potential because it reflects what the human ear perceives as the
sound propagates down the duct. In addition, the difference between the analytical and numerical values
provides some metric for the assessment of error in the calculations. Attenuations computed from
the DIPSS V2 solution vectors are compared to the analytical values for a frequency range of 0.5 to
5.0 kHz in table 2. In addition, the wall clock time in seconds and the number of iterations required
for the iterative solver to converge are also presented. Here, the hard wall duct has a planar wave
source, the soft wall duct uses the lowest order mode as the sound source, and the liner is 81.28
cm in length and has a uniform impedance of ζ = 1.5 - 0.5i. The impedances of the upper and two
sidewalls are set to rigid wall values. As expected, no attenuation of the sound is obtained in the
rigid wall duct (table 2) due to the absence of the wall treatment. The DIPSS attenuations for the
hard wall duct are in exact agreement with the analytical values of zero. Note that the presence
of the liner leads to an attenuation of the sound and that the frequency of peak attenuation is 2.5
kHz. Attenuations computed from the DIPSS V2 solution vector are in excellent agreement with
the analytical values in the lined duct. Compared to the case without lining, the wall lining
increases the wall clock time only slightly. Just as in the rigid wall
duct, the wall clock times that are required for a converged solution are nearly constant across the
frequency range.

Table 3: Attenuations for LaRC Flow Impedance Test Facility at Mach 0.45.
(Soft wall, cubic Hermite element, N = 595,968, 8 CPUs used)

f,       Anal     SGI
kHz      ΔdB      ΔdB
0.5      13.06    13.01
1.0      17.24    17.18
1.5      19.33    19.26
2.0      20.67    20.59
2.5      21.62    21.54
3.0      22.33    22.26
3.5      22.87    22.79
4.0      23.24    23.16
4.5      23.45    23.39
5.0      23.53    23.46


Table 3 compares the analytical attenuation with that obtained using the solution vector from
the SGI solver for a flow Mach number of 0.45 in the soft wall duct. Here, the 3-D cubic Hermite
element is used with a 16 x 16 x 291 uniformly spaced grid. This grid is considerably coarser than
that used with the linear element in table 2. However, with the higher order cubic element the SGI
attenuations are still in excellent agreement with the analytical values.

6.2 Generic Aero-Engine Duct 

The generic aero-engine duct is modeled as a rectangular duct by cutting it along the axis
and unwrapping it into a rectangular geometry. When unwrapped, the nacelle engine duct has a
317.5 cm x 63.5 cm rectangular cross-section and is 219.5 cm in length. Thus, the volume of
our generic aero-engine duct is slightly more than 2,075 times that of the Flow Impedance Test
Facility investigated in the previous example, and requires many more grid points for accurate
resolution of the acoustic field. The highest frequency of interest (5.0 kHz) is roughly equivalent
to four to six times the blade passage frequency (BPF) for a typical large commercial engine. Just
to illustrate the capability of the hybrid solver we have used a 100 x 100 x 2501 uniformly spaced
grid (N = 25,010,000). Such a large number of points is far beyond what can be solved using
direct sparse solvers such as the SGI solver.
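
The problem sizes quoted for the cubic Hermite grid and for the generic duct grid follow from the
node counts. The sketch below is a quick consistency check under one stated assumption that the
text does not spell out: a 3-D cubic Hermite element carries eight degrees of freedom per node (the
value, three first derivatives, three mixed second derivatives, and one mixed third derivative),
while the linear element carries one. The function name `unknown_count` is illustrative.

```python
def unknown_count(nx, ny, nz, dof_per_node):
    """Number of unknowns on a uniform nx x ny x nz grid of nodes."""
    return nx * ny * nz * dof_per_node


# Cubic Hermite grid of table 3: 16 x 16 x 291 nodes, 8 DOF per node (assumed).
hermite_unknowns = unknown_count(16, 16, 291, dof_per_node=8)

# Linear-element grid of the generic aero-engine duct: 100 x 100 x 2501 nodes.
generic_duct_unknowns = unknown_count(100, 100, 2501, dof_per_node=1)

print(hermite_unknowns, generic_duct_unknowns)
```

Under that assumption both counts reproduce the values of N reported for these two cases, which
also shows why the coarser Hermite grid still yields a system of more than half a million
equations.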

Table 4: Attenuations and Solver Statistics for Generic Aero-Engine Duct.
(Zero flow, N = 25,010,000, 192 CPUs used)

         Hard wall duct statistics                  Soft wall duct statistics
f,       Anal     DIPSS    Wall clock,   No.        Anal     DIPSS    Wall clock,   No.
kHz      ΔdB      ΔdB      sec           Iter       ΔdB      ΔdB      sec           Iter
0.5      0.000    0.000    1048          208        4.073    4.071    1695          427
1.0      0.000    0.000    1089          218        0.920    0.920    1933          498
1.5      0.000    0.000    1060          209        0.393    0.393    1921          492
2.0      0.000    0.000    1060          209        0.217    0.217    1928          494
2.5      0.000    0.000    1060          210        0.137    0.137    2105          537
3.0      0.000    0.000    1059          210        0.094    0.094    3491          985
3.5      0.000    0.000    1045          211        0.069    0.069    6810          2,000
4.0      0.000    0.000    1027          211        0.053    0.052    7047          2,000
4.5      0.000    0.000    1035          212        0.041    -0.275   6829          2,000
5.0      0.000    0.000    1016          213        0.033    1.139    6841          2,000


Table 4 compares the analytical and DIPSS attenuations and gives solver statistics for the
generic aero-engine duct. All parameters are identical to those of table 2 with the exception of
the duct dimensions and the grid. Speedup studies were conducted on the generic aero-engine
duct but because of space limitations these results could not be presented. However, super-linear
speedup was observed on as many as 256 CPUs and this speedup drops to linear on 384 CPUs.
Results in table 4 were run in parallel on 192 CPUs and required slightly more than 333 GB of
RAM. Hard wall attenuations in the generic aero-engine duct are in excellent agreement with the
analytical values of zero over the full range of source frequencies. It is observed that in this larger,
more realistic volume, the wall clock turnaround and the number of iterations required to obtain
a converged solution in the hard wall duct are essentially constant across the full range of source
frequencies. A more interesting set of results is obtained in the soft wall duct. Note that the chosen
lining is not very effective at attenuating the sound and the wall clock times are considerably higher
than in the rigid wall case. The predicted soft wall attenuations are in excellent agreement with the
analytical attenuations up to about 4.0 kHz, even though the 2,000 iteration limit was already
reached at 3.5 and 4.0 kHz. However, beyond 4.0 kHz the solution vector in the
soft wall duct had not converged within the 2,000 iteration limit and the predicted attenuations are
poor. This suggests that a better preconditioner is required for frequencies that are larger than 4.0
kHz in the soft wall duct.
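
The relation between the iteration limit and the attenuation error can be read off the soft wall
columns of Table 4. The sketch below is illustrative only; the 0.01 dB error threshold
`ERR_TOL_DB` is a hypothetical value chosen here to separate rounding-level differences from real
disagreement, not a criterion taken from the paper.

```python
# Soft wall columns of Table 4 (generic aero-engine duct, 192 CPUs).
freq_khz = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
analytical_db = [4.073, 0.920, 0.393, 0.217, 0.137, 0.094, 0.069, 0.053, 0.041, 0.033]
dipss_db = [4.071, 0.920, 0.393, 0.217, 0.137, 0.094, 0.069, 0.052, -0.275, 1.139]
iterations = [427, 498, 492, 494, 537, 985, 2000, 2000, 2000, 2000]

ITER_LIMIT = 2000   # iteration cap stated in the text
ERR_TOL_DB = 0.01   # hypothetical threshold for a meaningful disagreement

# Frequencies at which the iterative solver stopped at the iteration cap.
hit_limit = [f for f, n in zip(freq_khz, iterations) if n >= ITER_LIMIT]

# Frequencies at which the predicted attenuation departs from the analytical value.
inaccurate = [
    f for f, a, d in zip(freq_khz, analytical_db, dipss_db) if abs(a - d) > ERR_TOL_DB
]

print("iteration limit reached at:", hit_limit)
print("attenuation error above threshold at:", inaccurate)
```

The cap is reached from 3.5 kHz upward, while the attenuation error only becomes significant at
4.5 and 5.0 kHz, so reaching the cap degrades the prediction gradually rather than abruptly.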

7 CONCLUSIONS

The results of this study may be summarized as follows: 

1. When compared to analytical solutions, the hybrid solve strategy gives extremely accurate
attenuation predictions in rigid and soft wall ducts for the range of frequencies of interest in
full-scale aero-engine nacelles. This accuracy can be obtained using a preconditioner less
expensive than that provided by the Jacobi preconditioner.

2. In contrast to direct solve strategies, the hybrid solve strategy gives super-linear speedup
over hundreds of processors and allows for upward of 25 million complex unknowns to be
solved in slightly more than a quarter of an hour.

3. In addition to significant increases in speedup compared to the commonly used direct sparse
solver, the hybrid solve philosophy leads to a significant reduction in RAM.

4. Results of this study show that, for full-scale aero-engine modeling in lined ducts, the
hybrid solver convergence rate is slow for source frequencies above 4.0 kHz. Thus, an
improved preconditioner appears to be needed for noise computations beyond this source frequency.

The above conclusions are based upon the use of a rectangular geometry for which exact
analytical solutions are available for comparison. A similar study involving nonuniform geometry and
nonuniform mean flow for which exact solutions are not available is currently underway.